While there is a rich literature on transcription dynamics during the development of many organisms, 
protein data are limited. We used iTRAQ isotopic labeling and mass spectrometry to generate the largest 
developmental proteomic dataset for any animal. Expression dynamics of nearly 4,000 proteins of Xenopus 
laevis were measured from fertilized egg to neurula embryo. Expression profiles cluster into distinct groups. The cluster 
profiles accurately reflect the major events that mark changes in gene expression patterns during early 
Xenopus development. We observed a decline in the expression of ten DNA replication factors after the 
midblastula transition (MBT), including a marked decline of the licensing factor XCdc6. Ectopic expression 
of XCdc6 leads to apoptosis, indicating that temporal changes in this protein are critical for proper development. 
Measurement of expression in single embryos provided no evidence for significant protein heterogeneity 
between embryos at the same stage of development. 

Xenopus laevis has a long history as a model organism for studying early vertebrate development. Landmark 
advances in cell, molecular, and developmental biology using this animal include nuclear transfer 
experiments that demonstrated the totipotency of the nucleus from a differentiated cell, the first isolation of a 
gene from any organism, the first complete nucleotide sequence of a gene, and the purification of the first 
eukaryotic transcription factor. 

Xenopus has been especially important for studies of early vertebrate development because in vitro fertilization 
yields synchronized embryos that mature outside the mother, so that embryogenesis can be easily monitored in 
real time. Fate maps for organ development have been determined and major regulators of these processes have 
been identified and characterized, providing an abundance of tissue- and organ-specific markers to track embryo 
formation. Development of embryos is rapid (the gastrula stage is reached 9 hours post fertilization and a nearly 
fully developed nervous system is present at 4 days). The relatively large size of the egg (>1.2 mm) allows for facile 
microinjection of material of all types and for microsurgery. 

A number of techniques have been used to monitor changes in transcript expression during early stages of 
development in Xenopus. These studies have revealed a number of subtle mechanisms by which the organism 
controls protein expression. Only maternal mRNAs are present at fertilization and during the first seven stages of 
development. Post-transcriptional regulation controls protein translation, both by the destruction of mRNA and 
by activating "masked" or dormant mRNA through polyadenylation. Zygotic transcription begins at the 
midblastula transition (MBT; developmental stage 8); only then does the embryo begin to translate its own mRNA 
into protein. 

Although there is a rich literature on transcript analysis, the analysis of Xenopus protein expression changes 
during development is much more poorly documented. Many researchers have applied western blotting and 
other traditional methods to study protein expression changes during development. However, these methods 
are only able to characterize expression changes for a small number of proteins and are limited to the available 
antibodies. 

Large-scale qualitative and quantitative proteomic technologies have matured over the past two decades. 
The vast majority of proteomic studies employ a bottom-up protocol, wherein proteins are digested to peptides. 
These peptides undergo one or more stages of chromatographic separation, are detected using electrospray 
ionization-tandem mass spectrometry, and are identified through 
database searching algorithms. A number of quantitative proteomic 
techniques have been developed based on stable isotope labeling. 
Isobaric tags for relative and absolute quantitation (iTRAQ) is an 
isotopic labeling method that produces protein identification from 
peptide fragments and protein quantitation from low mass reporter 
ions at the MS/MS level. It has several benefits. First, it can analyze 
up to eight pools of peptides in a single analysis (iTRAQ 8plex), 
significantly speeding comparisons of protein abundance across many 
samples. Second, iTRAQ labeling does not increase the complexity of 
parent ion spectra because the mass increase produced by each label 
is the same. Since its introduction, iTRAQ has been widely used for 
quantitative proteomics. 
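
As a rough illustration of how reporter-ion quantitation works at the MS/MS level, the sketch below sums the intensity found near each of the eight reporter channels in a single tandem spectrum and expresses it relative to a reference channel. The channel labels (113-119 and 121) are those named in this work; the m/z offset, the tolerance, the function names, and the example peak list are illustrative assumptions, not the settings used by the search software.

```python
# Channel labels for iTRAQ 8plex as used in the text (there is no 120 channel).
CHANNELS = [113, 114, 115, 116, 117, 118, 119, 121]

def reporter_intensities(peaks, offset=0.11, tol=0.02):
    """Sum peak intensity near each reporter ion in one MS/MS spectrum.

    peaks  : iterable of (mz, intensity) pairs
    offset : assumed approximate mass excess of the reporter ions
    tol    : assumed matching tolerance in m/z units
    """
    totals = {ch: 0.0 for ch in CHANNELS}
    for mz, intensity in peaks:
        for ch in CHANNELS:
            if abs(mz - (ch + offset)) <= tol:
                totals[ch] += intensity
    return totals

def channel_ratios(totals, reference=113):
    """Express every channel relative to the chosen reference channel."""
    ref = totals[reference]
    if ref <= 0:
        return {ch: None for ch in totals}
    return {ch: value / ref for ch, value in totals.items()}

# Hypothetical spectrum: three reporter peaks plus one unrelated fragment ion.
spectrum = [(113.108, 100.0), (114.111, 200.0), (121.122, 50.0), (500.3, 999.0)]
totals = reporter_intensities(spectrum)
ratios = channel_ratios(totals)
```

Because all eight labels add the same mass to the intact peptide, the channels only separate after fragmentation, which is why the ratios are read from the low-mass region of the tandem spectrum rather than from the parent ion scan.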

There have been only a few reports determining protein expression 
changes in model systems of animal developmental biology, 
including Drosophila, Danio, and Xenopus embryos and 
mouse oocytes. In this work, we employed iTRAQ 8plex chemistry 
to monitor protein expression kinetics of Xenopus laevis embryos at 
six early stages of development. We generated quantitative data on 
the expression changes of nearly 4,000 proteins during four or more 
stages of Xenopus development. This is by far the largest Xenopus 
proteomic dataset, and the largest dataset on developmental 
proteomics for any organism. All the data related to this work were deposited 
in PeptideAtlas, http://www.peptideatlas.org/PASS/PASS00436. 

The mRNA expression of roughly 5,000 genes has been detected in 
Xenopus laevis early development (stages 2-33). Our observation of 
expression changes of nearly 4,000 proteins suggests that our dataset 
includes a large fraction of the proteome of the organism at early 
stages of development. 

This work included single zygotes in two iTRAQ experiments. The 
resulting datasets are by far the largest protein expression studies 
performed on single cells. 

Results and discussion 

Deep proteome analysis of Xenopus laevis embryos. We performed 
three independent iTRAQ experiments on Xenopus laevis embryos 
during early development. Micrographs of embryos at the stages of 
development employed in these experiments are shown in Fig. 1A, 
and the experimental design is shown in Fig. 1B. Experiments I and II 
each employed single embryos and 8plex chemistry. In Experiment I 
(E1), single embryos at stages 1, 5, 8, and 11 were used; two embryos 
from each stage were analyzed in separate iTRAQ channels as 
biological duplicates. In Experiment II (E2), single embryos from 
stages 1, 5, 13, and 22 were used; again, two embryos from each 
stage were analyzed as biological duplicates. To provide more 
material for analysis and to average any embryo-to-embryo 
heterogeneity in protein expression, Experiment III (E3) employed 
homogenates of four embryos each at stages 1, 8, 13, and 22, which 
were labeled using four channels of 8plex chemistry. Following 
digestion with trypsin and subsequent labeling using iTRAQ 
chemistry, the pooled peptides from each of the experiments were 
separated into 20 fractions by strong cation exchange liquid 
chromatography, followed by UPLC-ESI-MS/MS analysis with a 
Q-Exactive mass spectrometer. The .raw files were converted to 
.mgf files with RAW2MSM software using the default settings, 
followed by ProteinPilot™ 4.5 (AB Sciex) analysis with the database 
search mode set to "Thorough". Because RAW2MSM software filters out 
relatively low signal intensity fragment ions in tandem spectra, 
peptide identification is improved. 

Roughly 1 million MS/MS spectra were acquired over 150 hours in 
the three experiments, identifying an average of 3,403 ± 157 proteins 
per experiment, with a protein-level false discovery rate (FDR) < 1% 
(Supplementary Fig. S1A). The union of the three experimental 
results generated a total of 4,455 non-redundant protein features, 
which is the largest Xenopus laevis proteome dataset generated to 
date. The mRNA from ~5,000 genes has been detected at early stages 
of X. laevis development. Our dataset captures proteins translated 
from 90% of the genes listed in the published mRNA dataset of 
Xenopus embryos at these early developmental stages. 

When we filtered the peptide identifications at 95% confidence 
(peptide-level FDR of ~0.5%), 23,618 (E1), 21,956 (E2), and 29,422 
(E3) distinct peptides were identified; the union of the three experiments 
produced 36,977 peptides with confidence higher than 95%. Peptides 
with 95% or higher confidence are candidates for use in single/multiple 
reaction monitoring experiments (SRM/MRM); the large peptide 
datasets generated in this work can be used as a library for further 
targeted proteomic analysis. SRM/MRM analysis can replace western 
blotting, which is particularly valuable for high throughput 
quantitation of selected proteins and for quantifying proteins for 
which no antibody is available. Protein and peptide lists are provided 
in supplementary material II. 

We analyzed the proteins using the DAVID Bioinformatics 
Resource 6.7. The official gene symbols were obtained for 4,065 
proteins; 390 putative proteins have not yet been annotated. We 
determined the biological process, molecular function, and cellular 
localization of these proteins (Supplementary Fig. S1B-D). The 
highly enriched biological processes (Supplementary Fig. S1B) are 
related to DNA replication, transcription, translation, energy 
generation, and post-translational operations (i.e., protein transport, 
protein localization, and protein folding). The highly ranked molecular 
functions (Supplementary Fig. S1C) are related to nucleotide, 
nucleoside, and RNA binding; helicase activity; ATP binding and 
ATPase activity; ribosome structural constituents; and protein 
transporter activities. Biological processes and molecular functions reflect 
cell division, transcription, and protein expression and organization 
during the early stages of development. Cellular localization 
information of the proteins (Supplementary Fig. S1D) mirrors the 
biological processes; many proteins are associated with ribonucleoprotein 
(RNP) complexes, ribosome, mitochondrion, nuclear pore, 
Golgi-associated vesicle, translation initiation factors, and 
proton-transporting ATP-synthase complex. The DAVID bioinformatics 
results are also included in supplementary material II. 

Expression kinetics of 3,983 proteins during early Xenopus 
development. Nearly 90% of the identified proteins were also 
quantified in the three iTRAQ experiments. The high quantitation 
percentage is a result of high iTRAQ labeling efficiency and 
high specificity (100% at lysine and 99.33% at N-termini for the 
first 20,000 peptide matches per experiment). There were no 
identified labels on Tyr, His, Ser, or Thr residues in the 20,000 
peptide sets. Quantitation was further enhanced by the Q-Exactive 
mass spectrometer, which generated high intensity iTRAQ reporter 
ion signals (Supplementary Fig. S2). For charge 2 and 3 parent ions, 
the iTRAQ reporter ions were quite intense (Supplementary Fig. 
S2A); the median reporter ion signal intensities range from 5.2E+04 
(channel 116) to 9.6E+04 (channel 121) for E1 data 
(Supplementary Fig. S2B), which helps ensure high-quality quantitation 
estimates. 

In the three independent iTRAQ experiments, 2,889 (E1), 2,948 (E2), 
and 3,259 (E3) proteins were quantified, with identification FDRs 
of less than 1%. The union of the results for the three experiments 
yields quantitation data for 3,983 proteins, where each protein 
generated expression information at four or more embryonic stages 
(supplementary material III). This is by far the largest database of 
protein expression kinetics during the embryonic development of 
any organism. We were able to calculate p-values for more than 
80% of our reported quantitations. Using a target-decoy approach 
that compares ratios on duplicate iTRAQ channels, we calculated an 
FDR of less than 20% for our quantitation data. 

Biological replicates provide insight on the precision of the 
protocol. E1 and E2 generate an iTRAQ replicate based on the 
ratios between stage 5 and stage 1 from the two experiments 
Figure 1 | Experimental design. Micrographs of Xenopus laevis embryos at developmental stages used for iTRAQ measurements (A). Design and 
workflow of three independent iTRAQ experiments (B). Three experiments were performed. In experiment 1 (E1), two embryos at stage 1, two embryos 
at stage 5, two embryos at stage 8, and two embryos at stage 11 were separately lysed and digested with trypsin. The first embryo at stage 1 was labeled with 
the iTRAQ reagent channel 113, the second embryo at stage 1 was labeled with the iTRAQ reagent channel 114, the two embryos at stage 5 were labeled 
with the iTRAQ reagent channels 115 and 116, the two embryos at stage 8 were labeled with the iTRAQ reagent channels 117 and 118, and the two 
embryos at stage 11 were labeled with the iTRAQ reagent channels 119 and 121. These labeled peptides were pooled, subjected to strong cation exchange 
chromatography fractionation, and each fraction was analyzed using reversed-phase liquid chromatography and detection with a Q-Exactive mass 
spectrometer. Tandem mass spectra were analyzed both to identify the peptide and to quantitate the abundances of each peptide from each of the eight 
embryos. A similar procedure was performed in experiment 2 (E2), except that the biological duplicates consisted of single embryos taken from stages 1, 
5, 13 and 22. Finally, experiment 3 (E3) employed four pools of four embryos, where each pool was taken from embryos at stages 1, 8, 13 and 22. 
Figure 2 | Cluster analysis of quantified proteins. Proteins with significant change of abundance are grouped into one of six clusters according to the 
changes in expression as a function of developmental stage. Log2(protein abundance ratio) from experiment 3 (E3) was used for cluster analysis. The 
ratios of 1.2 and 0.8. 



(Supplementary Fig. S3A). The log2 ratio distribution is very narrow 
with a standard deviation of 0.19. E1 and E3 generate another 
biological replicate based on stage 8 embryos; E2 and E3 yield two more 
replicates based on stages 13 and 22. The excellent correlations 
between those replicates (slope 0.91-0.99; r = 0.96-0.99) further 
confirm the method's reproducibility (Supplementary Fig. S3B). 
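
The replicate comparison above can be sketched as a least-squares fit of log2 ratios from one experiment against those from another for the same proteins. This is a minimal illustration under assumptions: the function name and the example values are hypothetical, and the text does not state which fitting procedure was used to obtain the reported slopes and correlation coefficients.

```python
import math

def slope_and_r(x, y):
    """Least-squares slope of y on x and the Pearson correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Hypothetical log2(stage 8 / stage 1) ratios for four proteins,
# measured in two independent experiments.
log2_first = [-1.0, 0.0, 1.0, 2.0]
log2_second = [-0.9, 0.0, 0.9, 1.8]

slope, r = slope_and_r(log2_first, log2_second)
```

A slope near 1 indicates that the two experiments report changes of the same magnitude, while r near 1 indicates that they rank the proteins consistently; the two quantities answer different questions, which is why both are reported.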

The distributions of iTRAQ ratios are plotted in Supplementary 
Fig. S4. In all cases, the means of the log2 protein ratio distributions 
are very close to 0, which indicates that total protein abundances 
vary little during early development. These results are consistent with 
our BCA data (Supplementary Table S1). The protein abundance 
ratio distributions become broader at later stages (stage 11 (E1) 
and 22 (E2, E3)) as more proteins undergo a significant expression 
change. 

Cluster analysis. We performed cluster analysis based on abundance 
changes during development for all the quantified proteins from the 
three experiments (E1-E3), segregating proteins into those with no 
significant expression change (cluster 0) and six other groups that 
showed significant changes (clusters 1-6). Fig. 2 presents the 
clustering results from experiment E3; a total of 3,259 proteins 
were quantified in this experiment. Biological process information 
for the proteins in clusters 1-6 is presented in Supplementary Fig. S5. 
Results for E1 and E2 are presented in Supplementary Figs. S6-S9, and 
are also listed in supplementary material III. 

Proteins with expression ratios higher than 1.2 or lower than 0.8 
are considered significantly up- or down-regulated, respectively; 
these thresholds correspond to log2 values of 0.26 and -0.32. 
In experiment E3, 2,028 proteins showed no significant expression 
change (cluster 0), whereas 1,231 proteins were recognized 
as significantly up- or down-regulated during early development 
(clusters 1-6). 
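
The thresholds above can be checked with a short calculation. The sketch below converts the two ratio cut-offs to log2 units and labels a single abundance ratio; the function name and the example ratios are hypothetical, and how ratios from several stages are combined into a cluster assignment is not covered here.

```python
import math

UP_RATIO = 1.2    # ratio threshold for up-regulation (from the text)
DOWN_RATIO = 0.8  # ratio threshold for down-regulation (from the text)

def classify(ratio):
    """Label one protein abundance ratio (later stage / reference stage)."""
    if ratio > UP_RATIO:
        return "up"
    if ratio < DOWN_RATIO:
        return "down"
    return "unchanged"

# The same thresholds expressed in log2 units.
log2_up = math.log2(UP_RATIO)
log2_down = math.log2(DOWN_RATIO)

# Hypothetical ratios for three proteins.
labels = [classify(r) for r in (1.5, 1.0, 0.6)]
```

The two log2 thresholds are not symmetric about zero because 0.8 is not the reciprocal of 1.2 (which would be about 0.83).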

The cluster profiles accurately reflect the three major events that 
mark changes in gene expression patterns during early Xenopus 
development. Owing to the rapid cell divisions that occur 
immediately following fertilization, protein synthesis in early stage embryos 
relies on maternal transcripts that accumulated during oogenesis. 
After 12 cell divisions, the cell cycle lengthens and zygotic 
transcription begins during stage 8 at what is known as the midblastula 
transition (MBT). Many mRNAs synthesized in the oocyte are dormant or 
"masked" and become activated by cytoplasmic polyadenylation 
upon hormone-dependent maturation of the egg or upon 
fertilization. Organ and system specific proteins appear following the 
transcriptional reprogramming that occurs at stage 13. 
Figure 3 | Expression levels of DNA replication factors and histones. The expression levels of several DNA replication factors and histones change at the 
mid-blastula transition. (A) XCdc6 exhibits a marked decline after the mid-blastula transition relative to other replication factors. (B) Levels of Cdc6 
protein measured by western blot for control embryos and embryos injected with Cdc6 mRNA (1 ng). (C) Overexpression of Cdc6 triggers increased 
levels of apoptosis. Apoptosis was detected by staining late blastula (stage 9) embryos with PSS-380 for the fluorescent detection of phosphatidylserine. 
Left panel is water-injected control embryo; center panel is Cdc6 mRNA (1 ng) injected embryo; right panel is embryo incubated with 0.1 μM 
staurosporine to induce apoptosis. (D) Levels of maternal histone H1 (H1M/B4) decline while levels of adult H1 and core histones increase. Data from 
experiments E1 and E3 were used to generate (A) and (D), and the error bars for stage 8 were based on the results from experiments E1 and E3. 



Clusters 1 and 2. Proteins in clusters 1 and 2 are down-regulated 
during early development. These proteins are maternal in origin and 
their functions are mostly limited to oocytes and early stage embryos. 
The difference between these two groups is the time at which levels 
begin to decline. Proteins in cluster 1 decrease immediately after 
fertilization, whereas those in cluster 2 remain constant up to the 
MBT and then decline. Proteins found in cluster 1 include those that 
bind to and mediate translational repression of maternal mRNAs in 
Xenopus oocytes such as the Y box proteins (frgy 2 a/b) and Zygote 
arrest protein, as well as proteins such as Maskin and RAP55 that 
are constituents of the protein complex that binds to the cytoplasmic 
polyadenylation element and controls activation of masked mRNA 
through lengthening the poly(A) tail. Since this mechanism of 
translational control is limited to oocytes, we can easily rationalize 
the loss of these proteins following fertilization. 

Recently, it was demonstrated that the lengthening of the cell cycle 
and the resulting asynchronous cell division that begins at the MBT is 
due, at least in part, to a decline in the amounts of four replication 
factors (Cut5, RecQ4, Treslin, and Drf1) at the MBT. 
Overexpression of these four proteins in Xenopus embryos resulted 
in a two-fold increase in DNA replication that could be correlated 
with increased initiation. We likewise detect a decrease in the 
amounts of RecQ4 and Cut5 at the MBT that places them in cluster 
2; Treslin and Drf1 were not detected, most likely due to their low 
abundance. However, we also observe a much more pronounced 
decline during this period in the level of the replication licensing 
factor, XCdc6 (Fig. 3A). We confirmed the changes in XCdc6 by 
western blot (Fig. 3B). Cdc6 acts early in the initiation of replication 
by binding to the origin recognition complex and recruiting Mcm2-7 
to form the licensing or prereplicative complex; the other four factors 
are active later during kinase-dependent formation of the replisome. 

We considered whether the marked decline in XCdc6 also 
accounts for lengthening of the cell cycle at the MBT. Injection of 
mRNA encoding XCdc6 into one-cell embryos results in elevated 
levels of the factor that persist beyond stage 13 (Fig. 3B). However, 
overexpression of XCdc6 did not lead to a measurable increase in 
replication (results not shown). Caspase cleavage of Cdc6 can 
generate truncated forms of the factor that are incorporated into the 
prereplicative complex, but cannot support the recruitment of the 
Mcm complex. This incorporation, in turn, activates p53-mediated 
apoptosis. We asked whether overexpression of the licensing factor 
could be triggering a similar response in Xenopus embryos. Notably, 
the combined overexpression of Cut5, RecQ4, Treslin, and Drf1 
inhibits gastrulation and also leads to cell death at stage 17. 
Embryos injected with XCdc6 mRNA were compared to 
water-injected controls and to embryos incubated with the pro-apoptotic 
drug staurosporine. Stage 9 embryos were incubated with a stain 
(PSS-380) for phosphatidylserine, an early marker of apoptosis 
(Fig. 3C). Embryos overexpressing XCdc6 exhibit an appreciable 
amount of apoptosis that would mask detection of any increase in 
replication. The immediate effect of XCdc6 overexpression relative 
to the other replication factors may reflect their different roles 
(licensing vs. initiation) in replication or the means through which 
they signal replication defects. Notwithstanding this point, iTRAQ 
accurately identified a critical change in the amounts of a licensing 
factor that triggers apoptosis when perturbed (stage 8). 
We note that two XCdc6 isoforms (XCdc6A and XCdc6B) have 
been identified, and XCdc6A is down-regulated after the MBT. 
The two isoforms, which are named as XCdc6 protein and Cdc6B 
protein in the protein database, can be placed into two different 
protein groups in the protein identification lists (supplementary 
material II). However, in the protein quantitation list 
(supplementary material III), only XCdc6 protein (isoform A) was 
confidently quantified, which is most likely due to the two different 
data analysis pipelines employed for protein identification and 
quantitation in this work. In addition, the XCdc6A expression 
kinetics discovered in this work also agree with results from the 
Coleman group. 

Clusters 3 and 5. Clusters 3 and 5 comprise over 300 proteins, 
including G10 protein and Stat1, encoded by maternal mRNAs that 
become activated upon fertilization. These proteins are classic 
examples of the discordance between transcript and protein 
expression. This discordance arises from an appreciable amount of 
post-transcriptional regulation that controls expression of maternal 
mRNAs beyond the MBT. 

The notable difference between clusters 3 and 5 is that zygotic 
transcription appears to sustain protein levels in the former, whereas 
a marked decline in expression of proteins in cluster 5 is clearly 
coincident with the resumption of transcription and is presumably 
due to the decay of maternal transcripts. The diminished protein 
levels in cluster 5 coincide with progression of the embryo into the 
gastrula stage when organization of the three primary germ layers 
occurs, so the reduced levels of proteins in cluster 5 can also be a 
consequence of restricted, tissue-specific expression as more cells 
become differentiated, which is indeed the case for Stat1. 

Cluster 6. Cluster 6 is by far the largest in our dataset (351 proteins). 
Expression increases at the MBT for proteins in this cluster, which 
would indicate that this heightened expression results from the 
initiation of zygotic transcription at the MBT. For many of these 
proteins, this simple scenario appears correct, ranging from very 
large increases in mRNA levels (e.g., alkaline phosphatase, lin28a) 
to more moderate changes (e.g., fructose-1,6-bisphosphatase, fus, 
YAP, U2 auxiliary factor 2). 

However, comparison of protein and mRNA levels reveals a far 
greater complexity for this cluster that, in some cases, reflects 
contributions from both maternal and zygotic transcripts. Unlike 
clusters 3 and 5, some members of cluster 6 appear to be expressed 
initially from maternal transcripts whose activation is delayed until 
the MBT. For the BMP signaling agonist Twisted gastrulation 
(xTsg), maternal transcripts decrease and zygotic transcripts increase 
well after the MBT at late gastrulation; yet, we measure a continuous 
increase of xTsg protein from the MBT through stage 22. Thus, 
there must be a coordinated transition from one pool of mRNA to the 
other that likely reflects the increasingly restricted, tissue-specific 
expression of the protein following gastrulation. Fibronectin, which 
is also a member of cluster 6, provides a particularly good example of 
delayed activation of a maternal transcript. In accord with our data, 
Lee et al. measured a pronounced increase in the synthesis of 
fibronectin beginning around the MBT that is unaffected by inhibition of 
zygotic transcription using α-amanitin, demonstrating that some 
maternal mRNAs are activated even as zygotic transcription is 
initiated. 

Several members of cluster 6 exhibit a disparity between mRNA 
and protein levels that provides additional evidence for 
post-transcriptional regulation. In these instances there is little change in 
mRNA levels that can account for the elevated levels of protein 
(e.g., RAC1, aldehyde dehydrogenase 1, Ssbp2, Snap29, chromobox 
homolog 3). An especially important lesson from these data is that 
increased protein expression that commences at the MBT should not 
be immediately attributed to zygotic transcription. Regulation of 
maternal transcripts at or even beyond the MBT can be used to 
transition into zygotic gene expression patterns. While there appears 
to be little regulation of protein expression by miRNAs in Xenopus 
oocytes and early embryos, it would not be surprising if some 
members of cluster 6 are controlled by this mechanism as well. 

The amount of RNA-binding protein VgRBP71 shows a pronounced 
increase beginning at the MBT. VgRBP71 is associated with 
localized mRNAs in Xenopus oocytes; in somatic cells, isoforms of the 
protein function in alternative mRNA splicing, mRNA decay, and in 
transcriptional regulation. The prominent increase during gastrulation 
(from stage 8 to 13) is confirmed in a western blot analysis of 
staged embryos (Supplementary Fig. S10). 

Cluster 4. Major changes in gene expression occur at the 
gastrula-neurula transition (stage 13), which immediately precedes 
organogenesis. This period should mark the appearance of 
tissue-specific proteins, which is clearly reflected in cluster 4 where 
a constant level of protein through the first 15 hours of 
embryogenesis is followed by a marked increase beginning at stage 
13. Proteins in this cluster that reflect the imminent anatomical 
changes include: globin Y, multiple skeletal troponins, skeletal 
myosins, myosin 10 (neural cells), neurofilament protein, SPARC/ 
osteonectin (cilia cells), and Na+-K+-ATPase (kidney). 

Histone variants. We also identified eight histone variants that show 
a significant expression change during development (Fig. 3D). Two 
variants, H1M (B4 protein) and H2A.X-F (histone cluster 1, H2aa), 
undergo a similar down-regulation at stage 11. Six histone variants 
(H1A, H1B, H2A, H2B, H3, and H4) show an increase in expression 
beginning at stage 8. These data suggest that H2A.X-F is a maternal 
histone H2 subtype due to its similar expression profile to H1M, 
which has been shown to be a maternal histone H1 subtype. 
These data also suggest that histone H1M and H2A.X-F are the 
main histone variants before the MBT and that they are gradually 
replaced by H1A, H1B, and H2A at later stages of development. 
Because H1M binds to nucleosomal core and linker DNA with 
much lower affinity than somatic H1 variants, replacement of 
H1M with somatic linker histones fixes the position of 
nucleosomes on the DNA, resulting in chromatin compaction and 
formation of higher-order structure. The data support the 
hypothesis that the higher-order chromatin structure established 
during embryogenesis restricts the binding of initiation factors to 
the DNA, limiting origin use. 

Heterogeneity of protein expression in single Xenopus laevis 
embryos. Experiments E1 and E2 employed a single embryo for 
each iTRAQ channel, and biological duplicates were included for 
each embryonic stage (Fig. 1B). These results allow us to investigate 
stochastic variation in protein expression between embryos at the 
same stage of development at the single embryo level. 

We first calculated the Expression Ratio for proteins quantitated 
in two embryos in the same experiment at the same stage of develop- 
ment, eq. 1: 



$$\text{Expression Ratio (stage } n\text{)} = \frac{\text{Expression}_{\text{embryo 1}}}{\text{Expression}_{\text{embryo 2}}} \qquad (1)$$



where the ratio is calculated for each protein at the nth stage of 
development. The distributions of log2(Expression Ratio) for 
embryos are presented in Supplementary Fig. S11. These distributions 
are centered at zero, corresponding to equal average expression 
in the two embryos, at each stage of development. The standard 
deviations of the log2 distributions are ~0.25, corresponding to a 
~15% relative standard deviation in protein expression; 
embryo-to-embryo differences in protein expression at the same stage of 
development are very low in Xenopus laevis. 
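
The Expression Ratio of eq. 1 and the summary statistics of its log2 distribution can be sketched as follows. The protein names and abundance values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not state which estimator was used.

```python
import math
import statistics

def log2_expression_ratios(embryo1, embryo2):
    """Return log2(embryo 1 / embryo 2) for proteins quantified in both embryos."""
    shared = sorted(set(embryo1) & set(embryo2))
    return {p: math.log2(embryo1[p] / embryo2[p]) for p in shared}

# Hypothetical abundances for two embryos at the same developmental stage.
# protD is quantified in only one embryo and is therefore skipped.
embryo1 = {"protA": 2.0, "protB": 1.0, "protC": 4.0, "protD": 3.0}
embryo2 = {"protA": 1.0, "protB": 2.0, "protC": 4.0}

ratios = log2_expression_ratios(embryo1, embryo2)
values = list(ratios.values())
center = statistics.mean(values)   # near zero when neither embryo is favored
spread = statistics.stdev(values)  # width of the log2 distribution
```

Working in log2 units makes a two-fold excess in either embryo equally distant from zero, so the distribution is symmetric when there is no systematic difference between the two embryos.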

The small variation that was observed in the expression of some 
proteins could either be due to noise in the experimental protocol or 
due to noise in protein expression. If the variation is due to 
experimental noise, then the variation should not be correlated between 
experiments E1 and E2. We used the standard deviation of the 
absolute value of log2(Expression Ratio) for each protein in each 
experiment as our measure of variation (Supplementary Fig. S12). This 
arithmetic procedure generates the standard deviation of the relative 
change and ensures that the expression ratio always has the larger value 
in the numerator. The standard deviations are clustered at very low 
levels and are essentially uncorrelated (r = 0.40), which supports the 
interpretation that experimental noise rather than biological 
processes dominates any embryo-to-embryo variations in protein 
expression. 
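
The variation measure described above can be illustrated for a single protein. Taking the absolute value of log2(Expression Ratio) is equivalent to always placing the larger abundance in the numerator. In this sketch the function names and the example ratios are hypothetical, and the standard deviation is assumed to be taken across the stages measured in one experiment using the sample (n - 1) estimator.

```python
import math
import statistics

def folded_log2(ratio):
    """abs(log2(ratio)): identical for a ratio and for its reciprocal."""
    return abs(math.log2(ratio))

def variation(ratios_by_stage):
    """Standard deviation of the folded log2 ratios of one protein across stages."""
    return statistics.stdev([folded_log2(r) for r in ratios_by_stage])

# Hypothetical embryo 1 / embryo 2 ratios for one protein at four stages.
example = [2.0, 0.5, 1.0, 1.0]
v = variation(example)
```

Computing this value per protein in E1 and again in E2 gives two lists that can then be compared with a correlation coefficient, which is the comparison the paragraph above describes.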

Single cell proteomics. Embryos at stage 1 are zygotes that have not 
undergone division; they are single cells. While there have been 
efforts to characterize protein expression in single cells, the data 
presented here for Experiments E1 and E2 include the largest 
dataset for single cell proteomics. These amphibian zygotes are 
roughly three orders of magnitude larger than mammalian zygotes 
and six orders of magnitude larger than somatic cells. Significant 
instrumental advances will be required to obtain a similar 
proteomic depth on those cells. Recent advances in electrospray 
technology and on-column digestion provide some hope that 
comprehensive single cell proteomics may be extended to those 
cells. 
Methods 

Materials and reagents. Bovine pancreas TPCK-treated trypsin, urea, ammonium 
bicarbonate (NH4HCO3), dithiothreitol (DTT), and iodoacetamide (IAA) were 
purchased from Sigma-Aldrich (St. Louis, MO). Acetonitrile (ACN) and formic acid 
(FA) were purchased from Fisher Scientific (Pittsburgh, PA). Water was deionized 
by a Nano Pure system from Thermo Scientific (Marietta, OH). iTRAQ 8-plex kits 
were purchased from AB Sciex (Foster City, CA). 

Xenopus laevis was purchased from Nasco (Fort Atkinson, WI). Mammalian Cell-PE 
LB™ buffer for embryo lysis was purchased from G-Biosciences (St. Louis, MO). 
Complete, mini protease inhibitor cocktail (provided in EASYpacks) was purchased 
from Roche (Indianapolis, IN). 

Xenopus laevis embryo culture and collection. All animal procedures were 
performed according to protocols approved by the University of Notre Dame 
Institutional Animal Care and Use Committee. Female Xenopus laevis were induced 
to lay eggs by injection with 600 units of human chorionic gonadotropin (C1063, 
Sigma-Aldrich, St. Louis, MO) 12-15 hours prior to spawning. Testes were isolated 
from anesthetized males at the time of spawning. Eggs and minced testis were 
combined in 1/3 MMR (Marc's Modified Ringers) for fertilization, and embryos were 
maintained at room temperature throughout development. Embryos were collected at 
different stages based on the information from references 63 and 64. Eight eggs 
were collected at stage 1, each into a separate Eppendorf tube. Six embryos were 
then collected at each of stages 5, 8, 11, 13, and 22, again with a single embryo 
per Eppendorf tube. 

Embryo lysis and protein sample preparation. Each collected embryo was 
suspended in 50 µL of mammalian Cell-PE LB™ buffer containing complete protease 
inhibitor. After shaking for 30 s at room temperature, the tubes were sonicated for 
3 min on ice with a Branson Sonifier 250 (VWR Scientific, Batavia, IL) to lyse the 
embryos completely. The lysates were then kept on ice for half an hour. The tubes 
were next centrifuged at 12,000 g for 10 min. Finally, the supernatants were collected 
into fresh Eppendorf tubes and stored at −80 °C before use. 

The embryo lysate supernatants were precipitated with 300 µL of cold acetone at 
−20 °C for 6 h. After centrifugation, the supernatants were removed, and another 
300 µL of cold acetone was added to each tube to wash the pellets again. After further 
centrifugation, the supernatants were removed and protein pellets were dried at room 
temperature. 

The pellet in each tube was dissolved in 30 µL of 8 M urea, 100 mM NH4HCO3 
(pH 8.0) buffer by vortexing with sonication. After centrifugation, an 8 µL aliquot of 
protein solution was taken from each tube and further diluted to 24 µL with 100 mM 
NH4HCO3 (pH 8.0), followed by protein concentration measurement using the 
bicinchoninic acid (BCA) method. For experiment I (E1), two embryos from each of 
stages 1, 5, 8, and 11 were used and each embryo lysate was processed individually. For 
experiment II (E2), two embryos from each of stages 1, 5, 13, and 22 were used and each 
embryo lysate was also processed individually. For experiment III (E3), mixtures of 
four embryos from each of stages 1, 8, 13, and 22 were used for further 
preparation. All the samples were denatured at 37 °C for 1 h, followed by protein 
reduction in 20 mM DTT at 56 °C for 1 h and alkylation in 50 mM IAA at room 
temperature for 30 min in the dark. The treated samples were then diluted 
fourfold with 100 mM NH4HCO3 (pH 8.0) to reduce the urea concentration to about 
2 M, followed by trypsin digestion at 37 °C overnight at a trypsin/protein mass ratio 
of 1/25. 
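
The arithmetic behind these preparation steps can be written out as a short sketch. The function names and the example protein amounts are hypothetical and serve only to illustrate the ratios stated above; the small volume contributed by the DTT and IAA additions is ignored, which is one reason the final urea concentration is only "about" 2 M.

```python
def after_dilution(concentration, fold):
    """Concentration after an n-fold dilution (same units as the input)."""
    return concentration / fold


def stock_concentration(measured, aliquot_ul=8.0, final_ul=24.0):
    """Back-calculate the undiluted protein concentration from a BCA reading
    taken on an 8 uL aliquot diluted to 24 uL (a threefold dilution)."""
    return measured * final_ul / aliquot_ul


def trypsin_ug(protein_ug, enzyme_to_protein=1 / 25):
    """Trypsin mass needed for a given protein mass at a 1/25 mass ratio."""
    return protein_ug * enzyme_to_protein


# 8 M urea diluted fourfold gives 2 M.
urea_molar = after_dilution(8.0, 4)
# A hypothetical 50 ug protein sample would need 2 ug of trypsin.
enzyme_ug = trypsin_ug(50.0)
```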

The digests were acidified with formic acid to terminate the tryptic digestion, 
followed by peptide desalting with C18 spin columns (Pierce Biotechnology, 
Rockford, IL) for E1 and E2 samples, and with Sep-Pak C18 1 cc Vac Cartridge 
(Waters Corporation, Milford, MA) for E3 samples. After lyophilization, the peptides 
were labeled by 'isobaric tags for relative and absolute quantitation' (iTRAQ) 
8-plex reagents according to the manufacturer's protocols (AB Sciex, Foster City, CA) 
with the following minor modifications. The lyophilized digests for E1 and E2 were 
dissolved in 12.5 µL of dissolution buffer, and the digests for E3 were dissolved in 
35 µL of dissolution buffer. After addition of 50 µL of isopropanol to each iTRAQ 
reagent vial, 25 µL of iTRAQ reagent was added to the digest for E1 and E2, and an 
entire vial of iTRAQ reagent was used to label the digests for E3. After labeling at 
room temperature for 2 hours, 35 µL of 100 mM Tris-HCl buffer (pH 8.0) was added 
to the samples for E1 and E2 and incubated at room temperature for 40 min to block 
the residual iTRAQ reagents. For E3 samples, 100 µL of 100 mM Tris-HCl buffer 
(pH 8.0) was used. Then, the labeled samples in each experiment were mixed, and 
three tubes of labeled digests (E1, E2 and E3) were obtained. After lyophilization, 
the samples were dissolved in 500 µL (E1 and E2) or 800 µL (E3) of 2% ACN and 
0.1% FA solution, followed by desalting with Sep-Pak C18 1 cc Vac Cartridge 
(Waters). The digests were lyophilized again, and then redissolved in 250 µL of 0.1% 
FA, followed by strong cation exchange (SCX) liquid chromatography fractionation. 

SCX fractionation. The labeled samples from three experiments (E1, E2 and E3) were 
fractionated by SCX liquid chromatography using a Waters Alliance HPLC system 
(Waters, Milford, MA, USA) at a flow rate of 0.25 mL/min. About 150 µL of the 
labeled peptide samples was loaded onto an SCX guard column (4.6 mm i.d. × 
12.5 mm length, Agilent Technologies, Wilmington, DE, USA), and then separated 
with a Zorbax 300-SCX column (2.1 mm i.d. × 50 mm length, 5 µm particles, 
Agilent Technologies). The mobile phase gradient was generated using buffer A 
(10 mM KH2PO4, 20% ACN, pH 2.85) and buffer B (1 M KCl in A, pH 2.85). The 
samples in 0.1% FA were loaded, followed by 10 min washing with 100% A to remove 
excess iTRAQ reagent. Then, the peptides were separated by a 25 min linear 
gradient from 100% A to 100% B. Finally, the column was washed with 100% B for 
5 min, followed by column equilibration with 100% A. Fractions were collected from 
12 min to 42 min as follows. Eluate from 12 min to 18 min was collected as one 
fraction, and eluate from 36 min to 42 min as another. From 18 min to 36 min, the 
eluate was collected at 1 min/fraction. In total, 20 fractions were collected from each 
sample. 
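
As a check on the fraction count, the collection schedule above can be enumerated in a few lines. This is only an illustrative sketch; the function name is hypothetical, and times are in minutes from the start of the run as stated in the text.

```python
def scx_fraction_windows():
    """Return the (start_min, end_min) collection window of each SCX fraction."""
    windows = [(12, 18)]                              # one pooled early fraction
    windows += [(t, t + 1) for t in range(18, 36)]    # 1 min per fraction
    windows.append((36, 42))                          # one pooled late fraction
    return windows


windows = scx_fraction_windows()
# 1 pooled + 18 one-minute + 1 pooled = 20 fractions per sample.
n_fractions = len(windows)
```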

Peptide desalting. The collected SCX fractions were first lyophilized. Then, the 
fractions from E1 and E2 were redissolved in 30 µL of 2% ACN and 0.1% FA, followed 
by peptide desalting with a C18 ZipTip (ZTC18S096, Millipore, Bedford, MA, USA). 
The fractions from E3 were redissolved in 30 or 50 µL of 2% ACN and 0.1% FA, 
followed by desalting with a C18 ZipTip (Millipore) or a C18 spin column (Pierce 
Biotechnology) based on the sample amount in each fraction. The eluates were 
lyophilized and redissolved in 6 µL or 8 µL (E1 and E2) and 12 µL (E3) of 2% ACN 
and 0.1% FA solution, followed by ultra-performance liquid chromatography 
(UPLC)-ESI-MS/MS analysis. 

UPLC-ESI-MS/MS analysis. A nanoACQUITY UltraPerformance LC® (UPLC®) 
system (Waters, Milford, MA, USA) was used for peptide separation. Buffer A (0.1% 
FA in water) and buffer B (0.1% FA in ACN) were used as mobile phases for 
gradient separation. Peptides were automatically loaded onto a commercial C18 
reversed phase column (Waters, 100 µm × 100 mm, 1.7 µm particle, BEH130C18, 
column temperature 40 °C) with 2% buffer B for 10 min at a flow rate of 1 µL/min, 
followed by a three-step gradient: 2% to 8% B in 2 min, to 28% B in 114 min, and 
to 85% B in 3 min, then held at 85% B for 13 min. The column was equilibrated 
for 12 min with 2% B before analysis of the next sample. The eluted peptides from 
the C18 column were pumped through a capillary tip for electrospray, and analyzed by 
a Q-Exactive mass spectrometer (Thermo Fisher Scientific). For each sample, 2 µL 
of peptides were used for analysis. 
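
The gradient program can be expressed as a piecewise-linear function of time, which makes the segment boundaries explicit. The sketch below assumes that time zero is the start of the gradient (that is, after the 10 min loading step) and that each segment is linear; the names are hypothetical.

```python
# (time in min from gradient start, % buffer B) breakpoints taken from the text:
# 2 min from 2% to 8%, 114 min to 28%, 3 min to 85%, then a 13 min hold at 85%.
GRADIENT = [(0.0, 2.0), (2.0, 8.0), (116.0, 28.0), (119.0, 85.0), (132.0, 85.0)]


def percent_b(t_min):
    """Percent buffer B at t_min, by linear interpolation between breakpoints."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]  # past the end of the program: still at the hold value


total_gradient_min = GRADIENT[-1][0]  # 2 + 114 + 3 + 13 = 132 min
```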

The electrospray voltage was 1.6 kV, and the ion transfer tube temperature was 
280 °C. The S-Lens RF level was 50.00. Data acquisition was programmed in data 
dependent acquisition (DDA) mode. A top 12 method was used. Full MS scans were 
acquired in the Orbitrap mass analyzer over the m/z 380-1800 range with a resolution of 
70,000 (m/z 200) and the number of microscans set to 1. The target value was 1.00E+06, 
and the maximum injection time was 250 ms. For MS/MS scans, the twelve most 
intense peaks with charge state ≥ 2 were sequentially isolated and further fragmented 
in the higher-energy collisional dissociation (HCD) cell following one full MS 
scan. To reduce the interference of peptide co-fragmentation with the iTRAQ 
quantitation, the isolation window was set to 1.0 m/z. The normalized collision 
energy was 33%, and tandem mass spectra were acquired in the Orbitrap mass 
analyzer with a resolution of 35,000 (m/z 200). The fixed first mass was m/z 100.0. The 
target value was 1.00E+06 and the maximum injection time was 120 ms. The number of 
microscans was 1 and the ion selection threshold was 5.0E+04 counts. Peptide 
match and exclude isotopes were turned on. Dynamic exclusion was set to 30 s. 

Data analysis. The .raw files were converted to Mascot generic format (.mgf) files via 
RAW2MSM software with default settings for deep proteome analysis, and via 
Proteome Discoverer 1.3 (Thermo Fisher Scientific) with default settings for protein 
quantitation analysis. Protein Pilot™ 4.5 (AB Sciex, Foster City, CA, USA) was used 
for deep proteome analysis and protein quantitation analysis with .mgf files as input. 
The Paragon™ algorithm (v. 4.5.0.0, 1654) integrated in Protein Pilot™ 4.5 was used 
for database searching. The Xenopus laevis database (12/28/2012 version) was 
downloaded from the Xenbase website 
(ftp://ftp.xenbase.org/pub/Genomics/Sequences/). The Xenopus database was then 
combined with common contaminants and used for database searching. The parameters 
for database searching were as follows: iTRAQ 8plex (peptide labeled) was set as the 
sample type, iodoacetamide as the cysteine alkylation, trypsin as the digestion 
enzyme, Orbitrap MS (1-3 ppm) and Orbitrap MS/MS as the instrument, and urea 
denaturation as a special factor. In addition, search effort was set to "thorough" 
for deep proteome analysis, and "rapid" for protein quantitation. A search against 
the reversed database was also performed in order to evaluate FDRs at the peptide 
and protein levels. 
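
The general idea of estimating a false discovery rate from a reversed (decoy) database search can be illustrated with a generic sketch. This is not the procedure implemented in Protein Pilot, whose internals are not described here; the function name, the input layout, and the simple decoys/targets estimator are assumptions made for the example.

```python
def score_threshold_for_fdr(hits, max_fdr=0.01):
    """Find the lowest score cutoff at which the estimated FDR is below max_fdr.

    `hits` is a hypothetical iterable of (score, is_decoy) pairs, where
    is_decoy marks matches to the reversed database. The FDR at a cutoff is
    estimated as (decoy hits) / (target hits) among hits scoring at or above
    it. Score ties are not treated specially. Returns None if no cutoff works.
    """
    best = None
    targets = decoys = 0
    for score, is_decoy in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets < max_fdr:
            best = score
    return best
```

Hits scoring at or above the returned cutoff would then be accepted; the same logic applies whether the scored items are peptide identifications or protein groups.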

For deep proteome and protein quantitation analysis, the peptide confidence was 
filtered to produce a peptide-level global FDR of less than 1%. At the protein group level, 
the protein unused score was used to filter the protein identifications to produce a 
protein-level FDR of less than 1%. For protein quantitation, only ratios from the spectra 
that were unique to each protein were used for calculation of the protein ratio, and 
only "Auto" peptides were used for protein quantitation. Bias correction was applied 
to the protein quantitation results; this determines the median average protein 
ratio, corrects it to unity, and then applies this factor to all quantitation results. 
Proteins with an iTRAQ ratio higher than 20 or lower than 0.05 were not considered as 
quantified, and only proteins with reasonable ratios across all channels were 
recognized as quantified. We obtained the final protein quantitation 
information based on normalization to the mean of channels 113 and 114 (biological 
replicates of stage 1) for E1 and E2, and the mean of the iTRAQ protein ratios of the 
biological replicates for each embryonic stage was used for further data analysis. We 
calculated the FDR for our quantitative data using a target-decoy approach by 
comparing ratios on duplicate iTRAQ channels. 
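
The quantitation filters described above can be sketched as three small steps: median bias correction, rejection of extreme ratios, and normalization to the stage 1 channels. The sketch is illustrative only; the function names and data layout are hypothetical, and whether the limits of 0.05 and 20 are themselves accepted is an assumption.

```python
from statistics import mean, median


def bias_correct(ratios_by_protein):
    """Scale one channel's protein ratios so that their median becomes 1."""
    factor = median(ratios_by_protein.values())
    return {p: r / factor for p, r in ratios_by_protein.items()}


def is_quantified(channel_ratios, low=0.05, high=20.0):
    """Accept a protein only if every channel ratio lies within [low, high]."""
    return all(low <= r <= high for r in channel_ratios)


def normalize_to_stage1(ratios_by_channel, reference_channels=(113, 114)):
    """Divide each channel's ratio by the mean of the stage 1 channels."""
    ref = mean(ratios_by_channel[c] for c in reference_channels)
    return {c: r / ref for c, r in ratios_by_channel.items()}
```

For example, a protein with ratios of 1.0 and 3.0 on channels 113 and 114 and 4.0 on channel 115 has a stage 1 reference of 2.0, so its normalized ratio on channel 115 is 2.0.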

DAVID Bioinformatics Resources 6.7 was used to generate the gene symbols of 
the Xenopus laevis proteome, together with biological process, molecular function 
and cellular component information. 

The open source software GProX was used to visualize the protein quantitation data, 
including histograms and clustering analysis. The log2 ratios were used for analysis. 
For histogram analysis, default settings were used. For clustering analysis, the 
number of clusters was set to 6, and a fixed regulation threshold (upper limit of 0.26 and 
lower limit of −0.32, corresponding to original ratios of about 1.2 and 0.8) was 
used. The minimal membership for plotting was set to 0.5. Other parameters were left at 
their default settings. 
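
The correspondence between the log2 regulation limits and the original ratios is easy to verify: log2(1.2) is about 0.26 and log2(0.8) is about −0.32. The short sketch below checks this and shows how such a fixed threshold partitions proteins; the function name and the treatment of values exactly at a limit are assumptions, not GProX behavior.

```python
import math

UPPER_LOG2 = 0.26   # about log2(1.2)
LOWER_LOG2 = -0.32  # about log2(0.8)


def classify(log2_ratio):
    """Label a log2 ratio relative to the fixed regulation thresholds."""
    if log2_ratio >= UPPER_LOG2:
        return "up"
    if log2_ratio <= LOWER_LOG2:
        return "down"
    return "unchanged"


upper_check = round(math.log2(1.2), 2)
lower_check = round(math.log2(0.8), 2)
```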

Western blot analysis. To confirm protein quantitation determined by mass 
spectrometry, we also performed western blot analysis for several proteins. Following 
SDS-PAGE, proteins were transferred to a nitrocellulose membrane overnight at 4 °C 
and developed with antibody, produced in rabbit, specific to Xenopus laevis VgRBP71 
(generated in the Huber laboratory), actin (AHP1629, AbD Serotec, Raleigh, NC) and 
Cdc6 (kindly provided by the laboratory of Dr. William G. Dunphy at the California 
Institute of Technology). 

Xcdc6 mRNA injection experiments. One-cell embryos were injected with 1 ng of 
capped Xcdc6 mRNA or an equivalent volume of H2O and allowed to develop at 
room temperature in 1/3 MMR solution. At stage 9, embryos were incubated for one 
hour in a 200 µM solution of PSS-380 (a gift from Dr. Bradley Smith, University of 
Notre Dame), a fluorescent probe that binds to phosphatidylserine exposed on the 
surface of apoptotic cells. Apoptosis was induced in control embryos by incubation 
with staurosporine (0.1 µM) for one hour before staining with PSS-380. Fluorescence 
images were acquired using a Nikon TE-2000U epifluorescence microscope 
equipped with the appropriate UV filter set (ex: 340/80, em: 435/485) for blue 
fluorescence and edited using the Nikon Imaging Software (NIS) Elements v. 4.0. 